More Informative Open Information Extraction via Simple Inference

Authors

  • Hannah Bast
  • Elmar Haussmann
Abstract

Recent Open Information Extraction (OpenIE) systems utilize grammatical structure to extract facts with very high recall and good precision. In this paper, we point out that a significant fraction of the extracted facts are nevertheless not informative. For example, for the sentence "The ICRW is a non-profit organization headquartered in Washington", the extracted fact (a non-profit organization) (is headquartered in) (Washington) is not informative, because its subject does not name the entity being described. This is a problem for semantic search applications that use these triples, and it is hard to fix once triple extraction is complete. We therefore propose to integrate a set of simple inference rules into the extraction process. Our evaluation shows that, even with these simple rules, the percentage of informative triples can be improved considerably and the already high recall can be improved even further. Both improvements directly increase the quality of search on these triples.
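The abstract does not spell out the inference rules, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a hypothetical rule: whenever a triple (X) (is) (Y) has been extracted, any other fact whose subject is Y is also asserted about X. The `Triple` type and the `propagate_is_a` function are names invented for this example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def propagate_is_a(triples):
    """Hypothetical rule: for each (X, is, Y), copy facts about Y onto X."""
    # Map each description Y to the entities X it was asserted about.
    described = {}
    for t in triples:
        if t.predicate == "is":
            described.setdefault(t.obj, []).append(t.subject)

    inferred = []
    for t in triples:
        if t.predicate == "is":
            continue  # the "is" triples only drive the rule
        for entity in described.get(t.subject, []):
            inferred.append(Triple(entity, t.predicate, t.obj))
    return inferred


# Triples as an extractor might produce them for the example sentence.
extracted = [
    Triple("The ICRW", "is", "a non-profit organization"),
    Triple("a non-profit organization", "is headquartered in", "Washington"),
]
inferred = propagate_is_a(extracted)
```

Under this assumed rule, the uninformative triple from the example yields (The ICRW) (is headquartered in) (Washington), which names the entity a search application would actually be looking for.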


Similar resources

Discrimination of the Heart Ventricular and Atrial Abnormalities via a Wavelet-Aided Adaptive Network Fuzzy Inference System (ANFIS) Classifier

The aim of this study is to present a new feature extraction method for heart arrhythmia classification, based on a metric with a simple mathematical calculation called the Curve-Length Method (CLM). In the presented method, the curve length of the excerpted signal segment under study is considered as an informative feature in which the effect of important geometric parameters of the ori...


Filtering Information Extraction via User-Contributed Knowledge

Large repositories of knowledge can enable more powerful AI systems. Information Extraction (IE) is one approach to building knowledge repositories by extracting knowledge from text. Open IE systems like TextRunner [Banko et al., 2007] are able to extract hundreds of millions of assertions from Web text. However, because of imperfections in extraction technology and the noisy nature of Web text...


Answering Complex Questions Using Open Information Extraction

While there has been substantial progress in factoid question-answering (QA), answering complex questions remains challenging, typically requiring both a large body of knowledge and inference techniques. Open Information Extraction (Open IE) provides a way to generate semi-structured knowledge for QA, but to date such knowledge has only been used to answer simple questions with retrieval-based m...


Simulating Ratios of Normalizing Constants via a Simple Identity: a Theoretical Exploration

Let pi(w), i = 1, 2, be two densities with common support where each density is known up to a normalizing constant: pi(w) = qi(w)/ci. We have draws from each density (e.g., via Markov chain Monte Carlo), and we want to use these draws to simulate the ratio of the normalizing constants, c1/c2. Such a computational problem is often encountered in likelihood and Bayesian inference, and arises in f...


Statistica Sinica 6(1996), 831-860 SIMULATING RATIOS OF NORMALIZING CONSTANTS VIA A SIMPLE IDENTITY: A THEORETICAL EXPLORATION

Let pi(w), i = 1, 2, be two densities with common support where each density is known up to a normalizing constant: pi(w) = qi(w)/ci. We have draws from each density (e.g., via Markov chain Monte Carlo), and we want to use these draws to simulate the ratio of the normalizing constants, c1/c2. Such a computational problem is often encountered in likelihood and Bayesian inference, and arises in e...



Publication date: 2014